This paper describes the design of a low-complexity algorithm for adaptive
encoding/decoding of binary sequences produced by memoryless sources. The
algorithm implements universal block codes constructed for a set of contexts
identified by the numbers of non-zero bits in previous bits in a sequence.

We derive a precise formula for the asymptotic redundancy of such codes, which
refines the previous well-known estimate by Krichevsky and Trofimov [24], and
provide experimental verification of this result.

In our experimental study we also compare our implementation with existing
binary adaptive encoders, such as JBIG's Q-coder [41] and the CABAC [43]
algorithm of MPEG AVC (ITU-T H.264).

1 Introduction

One of the most basic tasks in the design of today's data compression algorithms
is converting input sequences of bits with some unknown distribution
into a decodable bitstream. This happens, for example, in the design of image or
video codecs, scalable (bit-slice based) encoding of spectrum in audio codecs, etc. In
most such cases, the bits to be encoded are taken from values produced by various
signal processing tools (transforms, prediction filters, etc.), which means that they
are already well de-correlated, and that the assumption of memorylessness of such a
source is justified.

Most commonly, the problem of encoding such sequences of bits is solved by
using fast (typically multiplication-free) approximations of binary adaptive
arithmetic codes. Two well-known examples of such algorithms are IBM's Q-coder [41],
adopted in the JBIG image coding standard [32], and the CABAC encoder [33] used in
the MPEG AVC/ITU-T H.264 standards for video compression [44].

In this paper we describe an alternative implementation of an adaptive encoder
using an array of Huffman codes designed for several estimated densities, indexed
by the numbers of non-zero bits in previous blocks (contexts) in a sequence.

We study both efficiency and implementation aspects of such a scheme and show
that by using even relatively short blocks (8...16 bits, and correspondingly
1.5...5K bytes of memory) it can achieve compression performance comparable or
superior to that of the algorithms quoted above.

This paper is organized as follows. In Section 2 we provide background information
about our coding problem. In Section 3 we quote known results about the efficiency
of such codes and offer a more precise result. In Sections 4 and 5 we describe the
design of our system, and in Section 6 we provide experimental results. Appendix A
contains the proof of our Theorem 1, and Appendix B contains the complete code of the
program we have designed.



2 Background Information 

Consider a memoryless source producing symbols from a binary alphabet {0, 1}, where
p denotes the probability of a 1 and q = 1 - p the probability of a 0. If w is a word
of length n produced by this source, then its probability is:

$$ \Pr(w) = p^k q^{n-k} , \qquad (1) $$

where k denotes the number of 1's in this word (sometimes k is also referred to as
the weight of w).

A block code $\phi$ is an injective mapping between words w of length |w| = n and
binary sequences (or codewords) $\phi(w)$:

$$ \phi : \{0,1\}^n \to \{0,1\}^* , \qquad (2) $$

where the codewords $\phi(w)$ represent a uniquely decodable set [7].

Typically, when the source (i.e., its probability p) is known, such a code is
designed to minimize its average length, or (in relative terms) its average redundancy:

$$ R_\phi(n,p) = \frac{1}{n} \sum_{|w|=n} \Pr(w)\,|\phi(w)| - H(p) . \qquad (3) $$

As is customary, by $H(p) = -p \log p - q \log q$ we denote the entropy of the source [7].

Classical examples of codes and algorithms suggested for solving this problem
include Huffman [18], Shannon [33], Shannon-Fano [12], and Gilbert-Moore [17] codes
and their variants. The performance of such codes is well studied; see, e.g., [25],
[35], [36], [30]. Analysis of their complexity can be found in [38], [39]. Many useful
practical implementation techniques for such codes have also been reported.

When the source is not known, the best option available is to design a universal
code $\phi^*$ that minimizes the worst-case redundancy for a class of sources [13], [25]:

$$ R_{\phi^*}(n) = \inf_{\phi} \sup_{p} R_\phi(n,p) . $$

An example of such a code can be constructed using the following estimates of words'
probabilities^1:

$$ P_{KT}(w) = \frac{\Gamma(k+1/2)\,\Gamma(n-k+1/2)}{\pi\,\Gamma(n+1)} , \qquad (4) $$

where $\Gamma(x)$ is the $\Gamma$-function, k is the weight of word w, and n is its length.

Finally, we might be in a situation when the exact value of the parameter of the source
is not known, but we can access a sequence of symbols u produced by this source in
the past. We will call such a sequence a sample, and will assume that it is |u| = t
bits long. The task here is to design a set of codes $\phi^*_u$ (indexed by different
values of this sample), such that their resulting worst-case average redundancy is minimal:

$$ R_{\phi^*_u}(n,t) = \inf_{\{\phi_u\}} \sup_{p} \sum_{|u|=t} \Pr(u)\, R_{\phi_u}(n,p) . \qquad (5) $$

Such codes are called sample-based or adaptive universal block codes.

1 This formula is due to Krichevsky and Trofimov [24], and it assures uniform (in p) convergence
to true probabilities with n -> oo. See [26] and [40] for discussions on its background and optimality.

In this paper we will study a particular implementation of adaptive block codes
utilizing the following estimates of probabilities of words w given a sample u:

$$ P_{KT}(w|u) = \frac{P_{KT}(uw)}{P_{KT}(u)} = \frac{\Gamma(k+s+1/2)\,\Gamma(n+t-k-s+1/2)}{\Gamma(s+1/2)\,\Gamma(t-s+1/2)} \, \frac{\Gamma(t+1)}{\Gamma(n+t+1)} , \qquad (6) $$

where s is the weight of a sample u, and t is its length.



3 Performance of Adaptive Block Codes 

The idea and original analysis of sample-based codes utilizing estimator (6) belong
to R. E. Krichevsky [23]. In particular, he has shown (cf. [24, Theorem 1], [25,
Theorem 3.4.1]) that the average redundancy rate of an adaptive block code is
asymptotically

$$ R_{\phi^*_u}(n,t) \sim \frac{1}{2n} \log\frac{n+t}{t} , \qquad (7) $$

where n is a block size, and t is the size of samples.

From (7) it is clear that by using samples of length t = O(n) it is possible to lower
the redundancy rate of such codes to $O(1/n)$, which matches the order of the redundancy
rate of block codes for known sources [25], [36].

However, in order to understand the full potential of such codes it is desirable
to know a more exact expression for their redundancy, including terms affected
by the choice of the actual code-construction algorithm (such as Huffman, Shannon,
etc.). The theorem below^2 offers such a refinement.

Theorem 1 (Reznik & Szpankowski 2003) The average redundancy rate of an
adaptive block code $\phi^*_u$ has the following asymptotic behavior ($n, t \to \infty$):

$$ R_{\phi^*_u}(n,t,p) = \sum_{|u|=t} \Pr(u)\, R_{\phi_u}(n,p) $$
$$ = \frac{1}{n} \left\{ \frac{1}{2} \log\frac{t+n}{t} + \Delta_{\phi^*_u}(n,t,p) + \frac{1-4pq}{24pq}\,\frac{n}{t(t+n)} - \frac{1-3pq}{24p^2q^2}\,\frac{(n+2t)\,n}{t^2(t+n)^2} + O\!\left(\frac{1}{t^3}+\frac{1}{n^3}\right) \right\} , \qquad (8) $$

where n is a block size, t is a sample size, p and q = 1 - p are the probabilities of symbols
of the input source, and where

$$ \Delta_{\phi^*_u}(n,t,p) = \sum_{|u|=t} \sum_{|w|=n} \Pr(u)\Pr(w) \left[ |\phi^*_u(w)| + \log P_{KT}(w|u) \right] \qquad (9) $$

is the average redundancy of code $\phi^*_u$ with respect to the estimated distribution (6).

2 This is a simple generalization of our previous result for adaptive Shannon codes [29].

Figure 1: Behavior of the factor $(1-4pq)/(24pq)$ in redundancy expression (8).

The exact behavior of $\Delta_{\phi^*_u}(n,t,p)$ is algorithm-specific, but for a large class of
minimum-redundancy techniques, which includes conventional Huffman and Shannon
codes, we can say that this term is bounded in magnitude

$$ |\Delta_{\phi^*_u}(n,t,p)| < 1 , $$

and that it exhibits oscillating behavior, which may or may not be convergent to
some constant depending on the value of the parameter p (cf. [36], [2], [29]).

We also notice that for short t and n the redundancy of such codes becomes
affected by the next term:

$$ \frac{1-4pq}{24pq}\,\frac{n}{t(t+n)} , $$

which is a function of the parameter of the source p. We plot the leading factor of this
term in Figure 1, and conclude that for short blocks/samples the performance of such
codes becomes sensitive to the asymmetry of the source.
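The behavior of this term is easy to tabulate. The sketch below evaluates the factor plotted in Figure 1 and the complete second-order term; the function names `asymmetry_factor` and `second_order_term` are hypothetical, and the overall 1/n scaling of (8) is left to the caller.

```c
#include <assert.h>
#include <math.h>

/* Factor (1 - 4pq) / (24pq) of the second-order term in (8). */
double asymmetry_factor(double p)
{
    assert(p > 0.0 && p < 1.0);
    double q = 1.0 - p;
    return (1.0 - 4.0 * p * q) / (24.0 * p * q);
}

/* Second-order term of (8), inside the braces: factor * n / (t (t + n)). */
double second_order_term(unsigned n, unsigned t, double p)
{
    assert(t > 0);
    return asymmetry_factor(p) * (double)n / ((double)t * (double)(t + n));
}
```

The factor vanishes for the unbiased source (p = 0.5), is symmetric in p and q, and grows without bound as p approaches 0 or 1; for p = 0.1 it is about 0.296.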

The proof of this theorem can be found in Appendix A; the rest of the paper is
devoted to the study of efficient algorithms for implementing such codes.

4 Efficient Implementation of Block Codes 

We first notice that in a memoryless model the probability of a word w (or its
estimate, cf. (1), (4), (6)) depends only on its weight k, but not on the actual pattern
of its bits. Hence, considering the set of all possible n-bit words, we can split it into
n+1 groups:

$$ \{0,1\}^n = W_{n,0} \cup W_{n,1} \cup \ldots \cup W_{n,k} \cup \ldots \cup W_{n,n} , \qquad (10) $$

containing words of the same weight (k = 0, ..., n), and the same probability.
Obviously, the sizes of such groups are $|W_{n,k}| = \binom{n}{k}$. For further convenience, we will
assume that each group $W_{n,k}$ stores words in lexicographic order. By $I_{n,k}(w)$ we
will denote the index (position) of a word w in a group $W_{n,k}$.

To describe the structure of our proposed mapping between words in groups
$W_{n,k}$ and their codewords, we will use an example code shown in Table 1.

Table 1: Example of a code constructed for 4-bit blocks with Bernoulli probabilities
$p^k q^{n-k}$, p = 0.1 (q = 0.9).

Block w   k   I_{n,k}(w)   Pr(w)    Length   Code phi(w)   Sub-group
0000      0   0            0.6561   1        1             0
0001      1   0            0.0729   3        001           1
0010      1   1            0.0729   3        010           1
0011      2   0            0.0081   6        000011        3
0100      1   2            0.0729   3        011           1
0101      2   1            0.0081   7        0000001       4
0110      2   2            0.0081   7        0000010       4
0111      3   0            0.0009   9        000000001     5
1000      1   3            0.0729   4        0001          2
1001      2   3            0.0081   7        0000011       4
1010      2   4            0.0081   7        0000100       4
1011      3   1            0.0009   9        000000010     5
1100      2   5            0.0081   7        0000101       4
1101      3   2            0.0009   9        000000011     5
1110      3   3            0.0009   10       0000000001    6
1111      4   0            0.0001   10       0000000000    7

This code was constructed using a modification of Huffman's algorithm [18], in which
additional steps were taken to ensure that codewords located at the same level have
the same lexicographic order as the input blocks that they represent. It is well known that
such a reordering is possible without any loss of compression efficiency, and examples
of prior algorithms that have used this idea include Huffman-Shannon-Fano
codes [5], canonic codes of Moffat and Turpin [27], [4], etc.

In Figure 2 we depict the structure of this code. As expected, each group $W_{n,k}$
consists of at most two sub-groups containing codewords of the same length^3:

$$ W_{n,k} = W_{n,k,\ell} \cup W_{n,k,\ell+1} , \qquad (11) $$

where $\ell$ is the shortest code length that can be assigned to blocks from $W_{n,k}$.
Moreover, since words within the $W_{n,k}$ group follow lexicographic order, the split
between $W_{n,k,\ell}$ and $W_{n,k,\ell+1}$ is simply:

$$ W_{n,k,\ell} = \{ w \in W_{n,k} : I_{n,k}(w) < n_k \} , \qquad (12) $$
$$ W_{n,k,\ell+1} = \{ w \in W_{n,k} : I_{n,k}(w) \ge n_k \} , \qquad (13) $$

where $n_k$ denotes the size of the subgroup with shorter codewords.

We will call the lexicographically smallest codewords in each subgroup base
codewords:

$$ B_{n,k,\ell} = \phi(w_0) , \qquad (14) $$
$$ B_{n,k,\ell+1} = \phi(w_{n_k}) , \qquad (15) $$

where $w_i$ is the i-th block in $W_{n,k}$, and note that the remaining codewords in both
subgroups can be computed as follows:

$$ \phi(w_i) = \begin{cases} B_{n,k,\ell} + i , & \text{if } i < n_k , \\ B_{n,k,\ell+1} + i - n_k , & \text{if } i \ge n_k . \end{cases} \qquad (16) $$

3 This follows from the fact that all words in $W_{n,k}$ have the same probability, and the so-called sibling
property of Huffman codes (cf. [16], [35], [36]). This observation also holds true for Generalized
Shannon codes [10] and possibly some other algorithms.

Figure 2: Structure of an example block code.



We point out that such base codewords are only defined for non-empty
subgroups, and that the number of such subgroups S in a tree constructed for n-bit
blocks is within:

$$ n + 1 \le S \le 2n . \qquad (17) $$

We also notice that multiple subgroups can reside on the same level^4 (see, e.g.,
level 10 in the tree in Figure 2), and the number of such collocated sub-groups cannot
be greater than n + 1.



4.1 Proposed Algorithm for Block Encoding/Decoding 

Based on the discussion above we can now define a simple algorithm for direct
computation of block codes.

We assume that parameters $n_k$ ($0 \le k \le n$) are available, and that for each
non-empty sub-group we can obtain its level $\ell$ and its base codeword $B_{n,k,\ell}$. Then
the process of encoding a block w is essentially a set of the following steps:

• using w, obtain its weight k and index $I_{n,k}(w)$,

• if $I_{n,k}(w) < n_k$ use the first subgroup $W_{n,k,\ell}$, otherwise pick $W_{n,k,\ell+1}$,

• retrieve the base codeword and compute the code according to (16).

A complete C-language code of such a procedure is presented as Algorithm 1
below.

Algorithm 1 Direct construction of block codes.

    /* encoder structure: */
    typedef struct {
        unsigned short nk[N+1];     /* # of elements in first (n,k) subgroup */
        unsigned char  sg[N+1][2];  /* (k,j) -> subgroup index mapping */
        unsigned char  len[S];      /* subgroup -> code length mapping */
        unsigned int   base[S];     /* subgroup -> base codeword mapping */
    } ENC;

    /* block encoder: */
    unsigned block_enc (unsigned w, ENC *enc, BITSTREAM *bs)
    {
        unsigned i, j, k, len, code;
        k = weight(w);              /* split w into (k, index) */
        i = index(N,k,w);
        if (i >= enc->nk[k]) {      /* find subgroup containing w */
            i -= enc->nk[k];        /* adjust index */
            j = enc->sg[k][1];
        } else
            j = enc->sg[k][0];
        code = enc->base[j] + i;    /* generate code */
        len = enc->len[j];
        put_bits(code, len, bs);    /* write code to bitstream */
        return k;
    }

4 This is one of the most obvious differences between our algorithm and the algorithms of Connell [5] or
Moffat and Turpin [27], which assign unique base codewords for each level, but then need
an $O(n 2^n)$-large reordering table to work with such codes. Here, the entire structure is $O(n^2)$ bits
large. Also, unlike [38], [39], our algorithm does not assume any particular order of probabilities
based on weight k. This way we can implement codes for universal densities such as (4).
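As an illustration of these steps, the following self-contained sketch hard-codes the parameters of the example code of Table 1 (n = 4, S = 8) and computes codewords via (16). The table layout mirrors the encoder structure of Algorithm 1, but the names `encode_block` and `CODEWORD`, the brute-force index computation, and the choice to return the codeword instead of writing it to a bitstream are assumptions made only for this example.

```c
#include <assert.h>

#define N 4   /* block size of the Table 1 example       */
#define S 8   /* number of subgroups in the Table 1 code */

/* Parameters transcribed from Table 1. */
static const unsigned char nk[N + 1]    = {1, 3, 1, 3, 1};   /* n_k       */
static const unsigned char sg[N + 1][2] = {{0, 0}, {1, 2}, {3, 4}, {5, 6}, {7, 7}};
static const unsigned char len[S]       = {1, 3, 4, 6, 7, 9, 10, 10};
static const unsigned int  base[S]      = {1, 1, 1, 3, 1, 1, 1, 0};

typedef struct { unsigned code; unsigned len; } CODEWORD;

/* Number of 1's in w. */
static unsigned weight(unsigned w)
{
    unsigned k = 0;
    for (; w; w >>= 1)
        k += w & 1u;
    return k;
}

/* Lexicographic index of w among words of equal weight (brute force). */
static unsigned index_in_group(unsigned w)
{
    unsigned i = 0, k = weight(w);
    for (unsigned v = 0; v < w; v++)
        if (weight(v) == k)
            i++;
    return i;
}

/* Encode one N-bit block: returns the codeword value and its length. */
CODEWORD encode_block(unsigned w)
{
    assert(w < (1u << N));
    unsigned k = weight(w);
    unsigned i = index_in_group(w);
    unsigned j;
    if (i >= nk[k]) {          /* word belongs to the longer subgroup */
        i -= nk[k];
        j = sg[k][1];
    } else {
        j = sg[k][0];
    }
    CODEWORD c = { base[j] + i, len[j] };
    return c;
}
```

For instance, block 1010 has weight 2 and index 4; since $n_2 = 1$, it falls into the second subgroup and receives the 7-bit code 0000100 (value 4), in agreement with Table 1.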

It can be seen that memory-wise this algorithm needs only S base codewords
(O(n)-bit long^5), n + 1 values $n_k$ (O(n)-bit long), S code lengths (O(log n)-bit long),
and 2(n + 1) subgroup indices (O(log n)-bit long). Given the fact that S = O(n),
the entire structure needs $O(n^2)$ bits.

In the particular implementation shown in Algorithm 1, and assuming, e.g., that
n = 20 and S = 32, the size of this structure becomes 244 bytes, far less than the $2^{20}$
words needed to represent this code in the form of a direct table.
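The 244-byte figure can be reproduced by adding up the sizes of the fields of the encoder structure. The sketch below assumes the common case of 2-byte `unsigned short` and 4-byte `unsigned int`; the function name `enc_table_bytes` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define N 20   /* block size         */
#define S 32   /* number of subgroups */

/* Encoder structure of Algorithm 1, instantiated for n = 20, S = 32. */
typedef struct {
    unsigned short nk[N + 1];      /* # of elements in first (n,k) subgroup */
    unsigned char  sg[N + 1][2];   /* (k,j) -> subgroup index mapping       */
    unsigned char  len[S];         /* subgroup -> code length mapping       */
    unsigned int   base[S];        /* subgroup -> base codeword mapping     */
} ENC;

/* Sum of the field sizes, in bytes (padding, if any, is not counted). */
size_t enc_table_bytes(void)
{
    return sizeof(((ENC *)0)->nk)  + sizeof(((ENC *)0)->sg)
         + sizeof(((ENC *)0)->len) + sizeof(((ENC *)0)->base);
}
```

On such a platform the four fields occupy 42 + 42 + 32 + 128 = 244 bytes.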

We note that for reasonably short blocks (e.g., $n \le 12 \ldots 16$) computation of
their weights and indices (functions weight(.) and index(.,.) in Algorithm 1)
can be a matter of a single lookup, in which case the entire encoding algorithm
needs at most 1 comparison, 2 additions, and 4 lookups.

5 We note that additional memory reduction is possible by storing incremental values of base
codewords; this is discussed in a companion paper [32].



For larger blocks, one can use the following well-known combinatorial formula
(cf. [21], [2], [33], [6], [38]):

$$ I_{n,k}(w) = \sum_{j=1}^{n} w_j \binom{n-j}{\sum_{i=j}^{n} w_i} , \qquad (18) $$

where $w_j$ represent individual bits of the word w, and it is assumed that $\binom{n}{k} = 0$
for all k > n. In order to implement it, one could either pre-compute all binomial
coefficients up to level n in Pascal's triangle, or compute them dynamically, using
the following simple identities:

$$ \binom{n-1}{k-1} = \frac{k}{n}\binom{n}{k} , \qquad \binom{n-1}{k} = \frac{n-k}{n}\binom{n}{k} . $$

The implementation based on pre-computed coefficients requires

$$ \frac{n(n+1)}{2} = O(n^2) $$

words ($O(n^3)$ bits) of memory, and O(n) additions. Dynamic computation of
coefficients will require O(n) additions, multiplications and divisions, but the entire
process needs only a few registers. Additional discussion on the complexity of index
computation can be found in [39].
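A direct transcription of the enumeration formula (18) might look as follows. This is a sketch rather than the lookup-based implementation used in Appendix B: it assumes that $w_1$ is the most significant of the n bits, computes each binomial coefficient from scratch instead of using the incremental identities, and uses hypothetical function names.

```c
#include <assert.h>

/* Binomial coefficient C(n, k); defined as 0 when k > n. */
static unsigned long binomial(unsigned n, unsigned k)
{
    if (k > n)
        return 0;
    unsigned long c = 1;
    for (unsigned i = 1; i <= k; i++)
        c = c * (n - k + i) / i;   /* c == C(n-k+i, i) after each step */
    return c;
}

/*
 * Lexicographic index of the n-bit word w among all n-bit words of the
 * same weight. Bits are scanned from w_n (least significant) to w_1,
 * so that `ones` always equals w_j + ... + w_n.
 */
unsigned long word_index(unsigned n, unsigned long w)
{
    assert(n >= 1 && n < 8 * sizeof(unsigned long));
    unsigned long index = 0;
    unsigned ones = 0;
    for (unsigned j = n; j >= 1; j--) {
        if ((w >> (n - j)) & 1ul) {
            ones++;
            index += binomial(n - j, ones);
        }
    }
    return index;
}
```

For the 4-bit blocks of Table 1 this gives, e.g., index 1 for 0101, index 3 for 1000, and index 5 for 1100. For large n the intermediate products in `binomial` may overflow, so a production version would need wider arithmetic or the pre-computed table discussed above.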

We now turn our attention to the design of a decoder. Here, we will also need
parameters $n_k$, base codewords, and their lengths. For further convenience (as
suggested by Moffat and Turpin [27]) we will use left-justified versions of base
values:

$$ B^{lj}_{n,k,\ell} = B_{n,k,\ell}\, 2^{T-\ell} , \qquad (19) $$

where T is the length of a machine word ($T > \max \ell$). We will store such
left-justified values in lexicographically decreasing order. Then, the decoding process
can be described as follows:

• find the first (top-most) subgroup with $B^{lj}_{n,k,\ell}$ not exceeding the current T bits of the
bitstream,

• decode the index of a block $I_{n,k}(w)$ (based on (16)), and

• produce the reconstructed block using its weight k and index.

A complete C-language code of such a procedure is presented as Algorithm 2. 

We note that (besides using left-justified base words) this algorithm has an almost
identical data structure. The only new elements here are weights k and subgroup
level indicators j (j = 0 if the subgroup contains shorter codewords, and j = 1
otherwise). Memory-wise it has very similar characteristics.

The main decoding process requires between 1 and S comparisons and lookups 
to find a subgroup, 1 or 2 additions, 1 shift, 1 extra comparison, and 3 extra lookups. 

As in Moffat-Turpin algorithm [27] the number of steps needed for finding a 
subgroup can be further reduced by placing base codewords in a binary search tree 
or using an extra lookup table, but in both cases we need to use extra memory to 
accomplish this. 

We note that at the end of the decoding process we also need to convert the word's
weight k and index $I_{n,k}(w)$ into its actual value (function word() in Algorithm 2). If
blocks are reasonably short, this can be accomplished by a simple lookup. Otherwise,
we can synthesize the word by using the enumeration formula (18). Complexity-wise
this process is similar to index computation in the encoder.

Algorithm 2 Decoding of block codes.

    /* decoder structure: */
    typedef struct {
        unsigned short nk[N+1];                  /* # of elements in first (n,k) subgroup */
        struct {unsigned char k:7,j:1;} kj[S];   /* subgroup -> (k,j) mapping */
        unsigned char len[S];                    /* subgroup -> code length mapping */
        unsigned int lj_base[S];                 /* subgroup -> left-justified codewords */
    } DEC;

    /* block decoder: */
    unsigned block_dec (unsigned *w, DEC *dec, BITSTREAM *bs)
    {
        unsigned i, j, k, len, val;
        val = bitstream_buffer(bs);
        for (j=0; dec->lj_base[j]>val; j++) ;    /* find a subgroup */
        len = dec->len[j];
        scroll_bitstream(len, bs);               /* skip decoded bits */
        i = (val - dec->lj_base[j]) >> (32-len);
        k = dec->kj[j].k;                        /* get weight */
        j = dec->kj[j].j;                        /* get sub-group index */
        if (j)                                   /* reconstruct index */
            i += dec->nk[k];
        *w = word(N,k,i);                        /* generate i-th word in (n,k) group */
        return k;
    }
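The decoding steps can be checked on the example code of Table 1. In the sketch below the left-justified base codewords are obtained from Table 1 via (19) with T = 32; the names `decode_block` and `DECODED`, the brute-force reconstruction of the word, and passing the bitstream window as a plain 32-bit value are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define N 4   /* block size of the Table 1 example       */
#define S 8   /* number of subgroups in the Table 1 code */

static const unsigned char nk[N + 1] = {1, 3, 1, 3, 1};
static const unsigned char len[S]    = {1, 3, 4, 6, 7, 9, 10, 10};
static const struct { unsigned char k, j; } kj[S] = {
    {0, 0}, {1, 0}, {1, 1}, {2, 0}, {2, 1}, {3, 0}, {3, 1}, {4, 0}
};
/* Base codewords of Table 1 shifted left by (32 - length), in decreasing order. */
static const uint32_t lj_base[S] = {
    0x80000000u, 0x20000000u, 0x10000000u, 0x0C000000u,
    0x02000000u, 0x00800000u, 0x00400000u, 0x00000000u
};

typedef struct { unsigned word; unsigned len; } DECODED;

static unsigned weight(unsigned w)
{
    unsigned k = 0;
    for (; w; w >>= 1)
        k += w & 1u;
    return k;
}

/* i-th word (in lexicographic order) among N-bit words of weight k. */
static unsigned word(unsigned k, unsigned i)
{
    for (unsigned w = 0; w < (1u << N); w++)
        if (weight(w) == k && i-- == 0)
            return w;
    return 0;   /* not reached for valid (k, i) */
}

/* Decode one block from the next 32 bits of the bitstream (val). */
DECODED decode_block(uint32_t val)
{
    unsigned j = 0;
    while (lj_base[j] > val)       /* last entry is 0, so the loop terminates */
        j++;
    unsigned l = len[j];
    unsigned i = (unsigned)((val - lj_base[j]) >> (32 - l));
    unsigned k = kj[j].k;
    if (kj[j].j)                   /* second subgroup: restore group index */
        i += nk[k];
    DECODED d = { word(k, i), l };
    return d;
}
```

For example, a window starting with the bits 0000100 selects the subgroup with base 0000001, yields index 3 + $n_2$ = 4 in group $W_{4,2}$, and hence the block 1010; any bits following the codeword in the window do not affect the result.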



5 Design of an Adaptive Block Coder 

Using above described algorithms we can now define a system for adaptive encod- 
ing/decoding of blocks of data. 

In this system, we assume that input blocks can be encoded under the following 
conditions: 

1. there is no context - i.e. we implement universal code, 

2. the context is given by one previously seen block - i.e. t = n, 

3. the context is given by two previously seen blocks - i.e. t = 2n. 

We note, that instead of using actual blocks as contexts it is sufficient (due to 
memoryless nature of the source) to use their weights. 

This means that for t-bit samples we will need an array of t + 1 code
structures indexed by their weights s. To further save space, we can use the symmetry
of KT-distributions (6) with respect to s and k: replace s with t - s and flip bits (i.e.,
force k = n - k) every time s > t/2. This way we will only need to define
t/2 + 1 tables.
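A possible way to apply this symmetry before a table lookup is sketched below. The structure `FOLDED` and the function `fold_context` are hypothetical names; the decoder would have to apply the same test on s and flip the bits of the decoded block back.

```c
#include <assert.h>

typedef struct {
    unsigned s;        /* folded sample weight, 0 <= s <= t/2      */
    unsigned w;        /* block to be coded (possibly bit-flipped) */
    int      flipped;  /* non-zero if the bits of w were inverted  */
} FOLDED;

/*
 * Map (s, w) to an equivalent pair with s <= t/2, using the symmetry
 * of (6): P(w | s) equals P(~w | t - s). n is the block size in bits.
 */
FOLDED fold_context(unsigned n, unsigned t, unsigned s, unsigned w)
{
    assert(n >= 1 && n < 32 && s <= t);
    FOLDED f = { s, w, 0 };
    if (2 * s > t) {
        f.s = t - s;
        f.w = ~w & ((1u << n) - 1u);   /* flip the n bits of the block */
        f.flipped = 1;
    }
    return f;
}
```

With n = 16 and t = 32, a sample of weight 20 is mapped to weight 12, and the block 0x00FF is coded as 0xFF00 using the table for s = 12.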

Hence, the overall amount of memory needed by our adaptive code becomes
1 + (n/2 + 1) + (n + 1) = 1.5n + 3 tables. Specific memory estimates for block sizes
n = 8...20 are shown in Table 2.

Table 2: Memory usage estimates [in bytes] for different block sizes

n    max l   max S   Size of a single table   Tables for all contexts
8    16      14      102                      1530
12   24      19      140                      2940
16   32      25      184                      4968
20   40      29      216                      7128
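The entries of Table 2 can be recomputed with a few lines of C. The table count follows the expression 1.5n + 3 given above. The per-table size used below, 2(n + 1) + 6S bytes, is an assumption: it corresponds to counting the nk, kj, len and lj_base fields of the decoder structure of Algorithm 2, and it happens to match the single-table sizes listed in Table 2. The function names are hypothetical.

```c
#include <assert.h>

/* Number of code tables: no context, t = n (folded), and t = 2n (folded). */
unsigned context_tables(unsigned n)
{
    return 1 + (n / 2 + 1) + (n + 1);
}

/*
 * Assumed size of one table in bytes: (n + 1) two-byte n_k values plus,
 * for each of the max_s subgroups, 1 byte (k,j), 1 byte length, and a
 * 4-byte left-justified base codeword.
 */
unsigned table_bytes(unsigned n, unsigned max_s)
{
    return 2 * (n + 1) + 6 * max_s;
}

/* Total memory for all contexts, in bytes. */
unsigned total_bytes(unsigned n, unsigned max_s)
{
    return context_tables(n) * table_bytes(n, max_s);
}
```

For n = 16 and S = 25 this gives 27 tables of 184 bytes each, i.e. 4968 bytes in total.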



In our test implementation we have generated all these tables using KT-estimated
densities (4) and (6), and using the modified Huffman code-construction algorithm
described in Section 4.

In Appendix B we provide a complete code of a program implementing such a 
system. 

6 Experimental Study of Performance of our Algorithm 

In this section we provide experimental results of evaluation of performance of our 
adaptive code with block size n = 16, and compare it with the following well known 
algorithms: 

• IBM's Q-coder algorithm [41] adopted in the JBIG standard for image
compression [12] (we've used the implementation from JBIG's jbigkit);

• CABAC binary arithmetic encoder [43] from the MPEG AVC/ITU-T H.264
standard for video coding.



In order to conduct our tests we've used computer-generated sequences of bits
simulating output from a binary Bernoulli source with probability p. Lengths of such
sequences ranged from 16 to 1024, and for each particular length we have generated
Q = 1000000 samples of such sequences.

Relative redundancy rates were computed as:

$$ \text{Rate} = \frac{(\text{sum of lengths of all codes produced for } Q \text{ sequences})/Q - H(p)}{H(p)} . $$

For our adaptive code we've used the following structure of contexts: 

• first 16-bit block is encoded without context (universal code), 

• second block is encoded using first one as its context (code with t = 16), 

• third and all subsequent blocks are encoded using two previous blocks in a 
sequence as contexts (sample-based code with t = 32). 
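The context structure listed above amounts to a small selection rule, sketched below. The type `SAMPLE` and the function `select_context` are hypothetical names; `w_prev1` and `w_prev2` stand for the weights of the one or two previously coded blocks, which is all that is needed because the source is memoryless.

```c
#include <assert.h>

typedef struct {
    unsigned t;   /* sample length in bits  */
    unsigned s;   /* sample weight (# of 1's) */
} SAMPLE;

/*
 * Choose the sample (t, s) for block number block_no (0-based) of an
 * n-bit block code: no context for the first block, one previous block
 * for the second, and two previous blocks afterwards.
 */
SAMPLE select_context(unsigned block_no, unsigned n,
                      unsigned w_prev1, unsigned w_prev2)
{
    assert(w_prev1 <= n && w_prev2 <= n);
    SAMPLE c = { 0, 0 };
    if (block_no == 1) {
        c.t = n;
        c.s = w_prev1;
    } else if (block_no >= 2) {
        c.t = 2 * n;
        c.s = w_prev1 + w_prev2;
    }
    return c;
}
```

For n = 16 this yields t = 0 for the first block, t = 16 for the second, and t = 32 for all subsequent blocks.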

The results of our experimental study are shown in Figures 3 and 4. It can
be seen that our code has a much faster rate of convergence than that of the Q-coder
or CABAC algorithms. It clearly outperforms them for short sequences, and
becomes comparable to the best of these two when the total length of encoded bits
approaches 1024.

Figure 3: Comparison of redundancy rates under memoryless sources with p = 0.1
(left) and p = 0.5 (unbiased case, right). [Plots of redundancy rate versus # of bits
encoded for Q-coder (JBIG), CABAC (H.264), and our code (n=16, t=0..32).]

Figure 4: Sensitivity of redundancy to asymmetry of the source. [Curves for Q-coder
(JBIG), CABAC (H.264), and our adaptive code (n=16, t=0..32).]



In Figure 4 we also show an analysis of the sensitivity of redundancy rates of these
codes to asymmetry of the source. Here, after 160 encoded bits (or ten 16-bit blocks)
our algorithm delivers much lower redundancy compared to the others. Its behavior is
consistent with the one predicted by our Theorem 1.

A Proof of Theorem 1



We need to evaluate the average redundancy of an adaptive code $\phi^*_u$ working with
samples of length t and blocks of length n produced by a binary memoryless source with
parameter p:

$$ R_{\phi^*_u}(n,t,p) = \frac{1}{n} \sum_{|u|=t} \sum_{|w|=n} \Pr(u)\Pr(w)\,|\phi_u(w)| - H(p) . \qquad (20) $$

We will further assume that each codeword $\phi_u(w)$ is generated on the basis
of the KT-estimated probability $P_{KT}(w|u)$ (6), and therefore we can rewrite (20) as
follows:

$$ R_{\phi^*_u}(n,t,p) = -\frac{1}{n} \sum_{|u|=t} \sum_{|w|=n} \Pr(u)\Pr(w) \log P_{KT}(w|u) - H(p) + \frac{1}{n}\,\Delta_{\phi^*_u}(n,t,p) , \qquad (21) $$

where by

$$ \Delta_{\phi^*_u}(n,t,p) = \sum_{|u|=t} \sum_{|w|=n} \Pr(u)\Pr(w) \left[ |\phi_u(w)| + \log P_{KT}(w|u) \right] \qquad (22) $$

we denote the redundancy of code $\phi^*_u$ with respect to the distribution it implements.

We know that, given the density $P_{KT}(w|u)$, most existing minimum-redundancy
block codes (such as block Huffman or Shannon algorithms) produce codewords
such that:

$$ \lfloor -\log P_{KT}(w|u) \rfloor \le |\phi_u(w)| \le \lceil -\log P_{KT}(w|u) \rceil , $$

which implies that $\Delta_{\phi^*_u}(n,t,p)$ is a quantity of bounded magnitude:

$$ |\Delta_{\phi^*_u}(n,t,p)| < 1 , $$

and which might have some erratic or oscillating behavior (cf. [36], [9]).
We now concentrate our attention on the main sum in (21):

$$ -\sum_{|u|=t} \sum_{|w|=n} \Pr(u)\Pr(w) \log P_{KT}(w|u) $$
$$ = -\sum_{|uw|=t+n} \Pr(uw) \log P_{KT}(uw) + \sum_{|u|=t} \Pr(u) \log P_{KT}(u) $$
$$ = (t+n)\,C_{KT}(t+n,p) - t\,C_{KT}(t,p) , \qquad (23) $$

where

$$ C_{KT}(n,p) = -\frac{1}{n} \sum_{|w|=n} \Pr(w) \log P_{KT}(w) \qquad (24) $$

is the average rate of the KT-estimator processing n-symbol words produced by p.

A.1 Asymptotic average rates of empirical entropy and KT-estimator

Consider the KT-estimated probability of a word w:

$$ P_{KT}(w) = \frac{\Gamma(k+1/2)\,\Gamma(n-k+1/2)}{\pi\,\Gamma(n+1)} . $$

Using Stirling's approximation (and excluding the cases when k = 0, n), we can show
that:

$$ -\log P_{KT}(w) = n F(w) + \frac{1}{2}\log n + \frac{1}{2}\log\frac{\pi}{2} + \frac{1}{12n} + \frac{1}{24k} + \frac{1}{24(n-k)} + O\!\left(\frac{1}{k^3} + \frac{1}{(n-k)^3}\right) , \qquad (25) $$

where:

$$ F(w) = -\frac{k}{n}\log\frac{k}{n} - \frac{n-k}{n}\log\frac{n-k}{n} \qquad (26) $$

is the empirical entropy [13], [21] of the word w.

The average rate of the empirical entropy F(w) under source p is:

$$ \sum_{|w|=n} \Pr(w) F(w) = -\sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k} \left[ \frac{k}{n}\log\frac{k}{n} + \frac{n-k}{n}\log\frac{n-k}{n} \right] $$
$$ = \log n - \sum_{k=1}^{n} \binom{n}{k} p^k q^{n-k}\, \frac{k}{n} \log k - \sum_{k=0}^{n-1} \binom{n}{k} p^k q^{n-k}\, \frac{n-k}{n} \log(n-k) $$
$$ = \log n - p \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1} q^{n-k} \log k - q \sum_{k=0}^{n-1} \binom{n-1}{k} p^{k} q^{n-1-k} \log(n-k) $$
$$ = \log n - p\, f(n-1,k,p) - q\, f(n-1,k,q) , \qquad (27) $$

where:

$$ f(n,k,\theta) = \sum_{k=0}^{n} \binom{n}{k} \theta^k (1-\theta)^{n-k} \log(1+k) . \qquad (28) $$

We immediately notice that (28) belongs to a class of so-called binomial sums
(see, e.g., [3, p. 92]), and that for large n we must have (uniform in $\theta$) convergence
$f(n,k,\theta) \to \log(1 + \theta n)$. However, in order to obtain a more detailed asymptotic of
$f(n,k,\theta)$, one must use analytic techniques, such as analytic depoissonization [19],
[20], or singularity analysis of generating functions [15].

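In fact, the last approach, in application to a class of polylogarithmic Bernoulli
sums, has already been used by P. Flajolet [14]; in particular, he has shown that

$$ \sum_{k=1}^{n} \binom{n}{k} \theta^k (1-\theta)^{n-k} \log k = \log(\theta n) + \frac{\theta-1}{2\theta n} - \frac{\theta^2 - 6\theta + 5}{12\,\theta^2 n^2} + O\!\left(\frac{1}{n^3}\right) , $$

which is a very similar sum to the one that we need to evaluate (28). To take advantage
of this existing result, we simply replace log(1 + k) in (28) with:

$$ \log(k+1) = \log k + \frac{1}{k+1} + \frac{1}{2}\,\frac{1}{(k+1)(k+2)} + \frac{5}{6}\,\frac{1}{(k+1)(k+2)(k+3)} + \cdots . \qquad (29) $$

The evaluation of the sums containing factorial powers of k yields:

$$ S_1(n,\theta) = \sum_{k=1}^{n} \binom{n}{k} \theta^k (1-\theta)^{n-k}\, \frac{1}{k+1} = \frac{1}{\theta(n+1)} \left[ 1 - (1-\theta)^n - \theta n (1-\theta)^n \right] = \frac{1}{\theta n} - \frac{1}{\theta n^2} + O\!\left(\frac{1}{n^3}\right) , \qquad (30) $$

$$ S_2(n,\theta) = \sum_{k=1}^{n} \binom{n}{k} \theta^k (1-\theta)^{n-k}\, \frac{1}{(k+1)(k+2)} = \frac{1}{\theta^2 (n+1)(n+2)} \left[ 1 - (1-\theta)^n - \theta n (1-\theta)^n - \frac{\theta^2 n(n+1)}{2} (1-\theta)^n \right] = \frac{1}{\theta^2 n^2} + O\!\left(\frac{1}{n^3}\right) , \qquad (31) $$

and it is clear that the contribution of the subsequent terms in (29) to the sum (28)
is within $O(1/n^3)$.

Combining the above formulae, we obtain

$$ f(n,k,\theta) = \log(\theta n) + \frac{1+\theta}{2\theta n} - \frac{\theta^2 + 6\theta - 1}{12\,\theta^2 n^2} + O\!\left(\frac{1}{n^3}\right) , \qquad (32) $$

and subsequently (after plugging (32) into (27) and some simple algebra):

$$ \sum_{|w|=n} \Pr(w) F(w) = H(p) - \frac{1}{2n} - \frac{1-pq}{12\,pq\,n^2} + O\!\left(\frac{1}{n^3}\right) , \qquad (33) $$

which is an (up to the $O(1/n^3)$ term) accurate asymptotic expression for the average rate
of empirical entropy.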

Remark 1 The average rate of empirical entropy has already been studied by Krichevsky
(cf. [24, Lemma 1] and [25, Lemma 3.2.1]). His conclusion was that:



— < Pr ( w ) F H - h (p) < • 



n 

\w\=n 



Our formula (33) confirms and refines this statement.

We now focus our attention on the average rate of the KT-estimator (24). Using
our asymptotic expression (25) and replacing $1/k$ and $1/(n-k)$ with the appropriate
factorial powers, we can show that:

$$ C_{KT}(n,p) = -\frac{1}{n} \sum_{|w|=n} \Pr(w) \log P_{KT}(w) $$
$$ = \sum_{|w|=n} \Pr(w) F(w) + \frac{1}{n} \left\{ \frac{1}{2}\log n + \frac{1}{2}\log\frac{\pi}{2} + \frac{1}{12n} + \frac{1}{24}\left[ S_1(n,p) + S_2(n,p) \right] + \frac{1}{24}\left[ S_1(n,q) + S_2(n,q) \right] \right\} + O\!\left(\frac{1}{n^4}\right) , \qquad (34) $$

where $S_1(n,\theta)$ and $S_2(n,\theta)$ are the already familiar sums (30) and (31).

Now by using (33) and expanding all the expressions in (34) we finally obtain:

$$ C_{KT}(n,p) = H(p) + \frac{1}{2n} \left\{ \log n + \log\frac{\pi}{2} - 1 - \frac{1-4pq}{12\,pq\,n} + \frac{1-3pq}{12\,p^2q^2\,n^2} \right\} + O\!\left(\frac{1}{n^4}\right) . \qquad (35) $$

A.2 Asymptotic average rate of the adaptive block code

Using (21)-(24) we can now say that

$$ R_{\phi^*_u}(n,t,p) = \frac{1}{n} \left[ (t+n)\,C_{KT}(t+n,p) - t\,C_{KT}(t,p) - n H(p) + \Delta_{\phi^*_u}(n,t,p) \right] , \qquad (36) $$

where $C_{KT}(n,p)$ is the average rate of the KT-estimator (24).

By applying our asymptotic result (35) for $C_{KT}(n,p)$ and combining the remaining
(after some cancellations) terms we arrive at

$$ R_{\phi^*_u}(n,t,p) = \frac{1}{n} \left\{ \frac{1}{2} \log\frac{t+n}{t} + \Delta_{\phi^*_u}(n,t,p) + \frac{1-4pq}{24pq}\,\frac{n}{t(t+n)} - \frac{1-3pq}{24p^2q^2}\,\frac{(n+2t)\,n}{t^2(t+n)^2} + O\!\left(\frac{1}{t^3}+\frac{1}{n^3}\right) \right\} , $$

which is formula (8) claimed by our theorem.



B Example implementation of adaptive block coder 

/* bitstream.h: */

typedef struct _BITSTREAM BITSTREAM;

void bitstream_open(BITSTREAM *p, unsigned char *pbs, unsigned bit_offset, int read);
void bitstream_close(BITSTREAM *p, unsigned char **p_pbs, unsigned *p_bit_offset, int write);
void put_bits(unsigned bits, int len, BITSTREAM *p);
unsigned bitstream_buffer(BITSTREAM *p);
void scroll_bitstream(int len, BITSTREAM *p);


/* blade.h: */

/* encoder functions: */
void blade_enc_init(void);
unsigned blade_enc_0(unsigned block, BITSTREAM *bs);
unsigned blade_enc_1(unsigned block, unsigned cx, BITSTREAM *bs);
unsigned blade_enc_2(unsigned block, unsigned cx1, unsigned cx2, BITSTREAM *bs);

/* decoder functions: */
void blade_dec_init(void);
unsigned blade_dec_0(unsigned *block, BITSTREAM *bs);
unsigned blade_dec_1(unsigned *block, unsigned cx, BITSTREAM *bs);
unsigned blade_dec_2(unsigned *block, unsigned cx1, unsigned cx2, BITSTREAM *bs);



/* blade_12.c: implements 12-bit BLADE encoder/decoder */

#define N   12      /* block size */
#define SGS 19      /* max # of subgroups */

/* encoder structure: */
typedef struct {
    unsigned short nk [N+1];      /* # of elements in first (n,k) subgroup */
    unsigned char  len [SGS];     /* subgroup -> code length mapping */
    unsigned char  sg [N+1][2];   /* (k,j) -> subgroup index mapping */
    unsigned int   base [SGS];    /* subgroup -> base codeword mapping */
} BLADE_ENC;

/* w -> (k, index) mapping: */
static struct {unsigned short k:4, i:12;} w_ki[1<<N];



/*
 * BLADE encoder:
 * Returns:
 *   # of bits set in encoded pattern
 */
unsigned blade_enc (unsigned w, BLADE_ENC *enc, BITSTREAM *bs)
{
    unsigned i, j, k, len, code;
    k = w_ki[w].k;              /* split w into (k, index) */
    i = w_ki[w].i;
    if (i >= enc->nk[k]) {      /* find subgroup containing w */
        i -= enc->nk[k];        /* adjust index */
        j = enc->sg[k][1];
    } else
        j = enc->sg[k][0];
    code = enc->base[j] + i;    /* generate code */
    len = enc->len[j];
    put_bits(code, len, bs);
    return k;
}

/* decoder structure: */
typedef struct {
    unsigned int sgs;                          /* number of subgroups */
    unsigned short nk [N+1];                   /* # of elements in first (n,k) subgroup */
    unsigned char len [SGS];                   /* subgroup -> code length mapping */
    struct {unsigned char k:7,j:1;} kj [SGS];  /* subgroup -> (k,j) mapping */
    unsigned int lj_base [SGS];                /* subgroup -> left-justified codewords */
} BLADE_DEC;

/* (k, index) -> w mapping: */
static unsigned short *ki_w[N+1], _w[1<<N];



/*
 * BLADE decoder:
 * Returns:
 *  # of bits set in encoded pattern
 */
unsigned blade_dec (unsigned *w, BLADE_DEC *dec, BITSTREAM *bs)
{
  unsigned i, j, k, len, val;
  val = bitstream_buffer (bs);
  for (j=0; j<dec->sgs; j++)            /* find subgroup */
    if (dec->lj_base[j] <= val)
      break;
  len = dec->len[j];
  scroll_bitstream (len, bs);           /* skip decoded bits */
  i = (val - dec->lj_base[j]) >> (32-len);
  k = dec->kj[j].k;
  j = dec->kj[j].j;
  if (j)                                /* convert to (n,k)-group's index */
    i += dec->nk[k];
  *w = ki_w[k][i];                      /* produce reconstructed block */
  return k;
}
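The subgroup search in blade_dec works because the left-justified base codewords are stored in decreasing order, so the first base not exceeding the 32-bit bitstream window identifies the subgroup, and the offset from that base gives the index. A minimal sketch of the same idea follows, assuming a hypothetical three-subgroup toy code (TOY_GROUP, toy and toy_decode are illustrative names, not part of the BLADE library):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy prefix code (NOT one of the BLADE tables):
 *   subgroup 0: length 1, codeword  "1"
 *   subgroup 1: length 3, codewords "010", "011"
 *   subgroup 2: length 3, codewords "000", "001"
 * Each subgroup stores its smallest codeword, left-justified in 32 bits. */
typedef struct { uint32_t lj_base; unsigned len; } TOY_GROUP;

static const TOY_GROUP toy[3] = {
  {0x80000000u, 1},
  {0x40000000u, 3},
  {0x00000000u, 3}
};

/* Decode one codeword from a left-justified 32-bit window.
 * Returns the index inside the subgroup; *group and *len receive
 * the subgroup number and the number of bits consumed. */
static unsigned toy_decode (uint32_t val, unsigned *group, unsigned *len)
{
  unsigned j;
  for (j = 0; j < 3; j++)          /* bases are sorted in decreasing order */
    if (toy[j].lj_base <= val)
      break;
  assert (j < 3);                  /* last base is 0, so a match always exists */
  *group = j;
  *len = toy[j].len;
  return (val - toy[j].lj_base) >> (32 - toy[j].len);
}
```

Bits that follow the codeword in the window do not affect the result, since they are shifted out when the index is computed.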



/*
 * Pre-computed BLADE decoder tables:
 */
static BLADE_DEC dec_t [1+(N/2+1)+(N+1)] = {
  { /* no context / universal code: */ 15,
    {1,12,66,92,495,792,924,792,495,122,66,12,1},
    {3,3,7,7,10,10,11,11,12,12,13,13,14,14,14},
    {{0,0},{12,0},{1,0},{11,0},{2,0},{10,0},{3,0},{9,0},{3,1},{9,1},{4,0},{8,0},{5,0},{6,0},{7,0}},
    {0xE0000000, 0xC0000000, 0xA8000000, 0x90000000, 0x7F800000, 0x6F000000, 0x63800000, 0x54400000,
     0x4C400000, 0x46200000, 0x36A80000, 0x27300000, 0x1AD00000, 0x0C600000, 0x00000000} },
  { /* (12,0): */ 17,
    {1,8,66,64,495,792,924,792,334,220,66,11,1},
    {1,5,6,9,12,13,15,17,19,20,21,22,22,23,23,24,24},
    {{0,0},{1,0},{1,1},{2,0},{3,0},{3,1},{4,0},{5,0},{6,0},{7,0},{8,0},{8,1},{9,0},{10,0},{11,0},{11,1},{12,0}},
    {0x80000000, 0x40000000, 0x30000000, 0x0F000000, 0x0B000000, 0x06200000, 0x02420000, 0x00B60000,
     0x00428000, 0x00110000, 0x00069000, 0x00040C00, 0x00009C00, 0x00001800, 0x00000200, 0x00000100, 0x00000000} },
  { /* (12,1): */ 16,
    {1,12,17,220,495,792,924,340,495,220,66,10,1},
    {2,5,8,9,11,13,15,16,17,18,18,19,19,19,19,20},
    {{0,0},{1,0},{2,0},{2,1},{3,0},{4,0},{5,0},{6,0},{7,0},{7,1},{8,0},{9,0},{10,0},{11,0},{12,0},{11,1}},
    {0xC0000000, 0x60000000, 0x4F000000, 0x36800000, 0x1B000000, 0x0B880000, 0x05580000, 0x01BC0000,
     0x01120000, 0x00A10000, 0x00254000, 0x0009C000, 0x00018000, 0x00004000, 0x00002000, 0x00000000} },
  { /* (12,2): */ 15,
    {1,12,66,211,495,792,924,792,486,220,66,12,1},
    {3,6,8,10,11,12,14,15,16,16,17,17,17,17,17},
    {{0,0},{1,0},{2,0},{3,0},{3,1},{4,0},{5,0},{6,0},{7,0},{8,0},{8,1},{9,0},{10,0},{11,0},{12,0}},
    {0xE0000000, 0xB0000000, 0x6E000000, 0x39400000, 0x38200000, 0x19300000, 0x0CD00000, 0x05980000,
     0x02800000, 0x009A0000, 0x00958000, 0x00278000, 0x00068000, 0x00008000, 0x00000000} },
  { /* (12,3): */ 16,
    {1,12,30,220,495,792,924,792,19,220,6,12,1},
    {4,6,8,9,10,12,13,14,14,14,14,14,14,15,15,15},
    {{0,0},{1,0},{2,0},{2,1},{3,0},{4,0},{5,0},{6,0},{7,0},{8,0},{10,0},{11,0},{12,0},{8,1},{10,1},{9,0}},
    {0xF0000000, 0xC0000000, 0xA2000000, 0x90000000, 0x59000000, 0x3A100000, 0x21500000, 0x12E00000,
     0x06800000, 0x06340000, 0x061C0000, 0x05EC0000, 0x05E80000, 0x02300000, 0x01B80000, 0x00000000} },
  { /* (12,4): */ 16,
    {1,12,66,220,495,303,924,792,495,219,66,4,1},
    {5,7,9,10,12,12,12,12,13,13,13,13,13,13,14,14},
    {{0,0},{1,0},{2,0},{3,0},{4,0},{5,0},{11,0},{12,0},{5,1},{11,1},{6,0},{7,0},{9,0},{10,0},{9,1},{8,0}},
    {0xF8000000, 0xE0000000, 0xBF000000, 0x88000000, 0x69100000, 0x56200000, 0x55E00000, 0x55D00000,
     0x46880000, 0x46480000, 0x29680000, 0x10A80000, 0x09D00000, 0x07C00000, 0x07BC0000, 0x00000000} },
  { /* (12,5): */ 15,
    {1,12,66,220,495,792,509,792,350,220,66,12,1},
    {6,8,10,10,11,11,12,12,12,12,12,12,13,13,13},
    {{0,0},{1,0},{2,0},{12,0},{3,0},{11,0},{4,0},{5,0},{6,0},{8,0},{9,0},{10,0},{6,1},{8,1},{7,0}},
    {0xFC000000, 0xF0000000, 0xDF800000, 0xDF400000, 0xC3C00000, 0xC2400000, 0xA3500000, 0x71D00000,
     0x52000000, 0x3C200000, 0x2E600000, 0x2A400000, 0x1D480000, 0x18C00000, 0x00000000} },
  { /* (12,6): */ 15,
    {1,12,66,47,495,792,924,792,495,85,66,12,1},
    {8,8,9,9,11,11,11,11,12,12,12,12,12,12,13},
    {{0,0},{12,0},{1,0},{11,0},{2,0},{3,0},{9,0},{10,0},{3,1},{9,1},{4,0},{5,0},{7,0},{8,0},{6,0}},
    {0xFF000000, 0xFE000000, 0xF8000000, 0xF2000000, 0xE9C00000, 0xE3E00000, 0xD9400000, 0xD1000000,
     0xC6300000, 0xBDC00000, 0x9ED00000, 0x6D500000, 0x3BD00000, 0x1CE00000, 0x00000000} },
  { /* (24,0): */ 19,
    {1,12,25,220,487,791,924,787,494,220,66,11,1},
    {1,5,9,10,13,16,17,19,20,22,24,25,26,27,28,30,31,32,32},
    {{0,0},{1,0},{2,0},{2,1},{3,0},{4,0},{4,1},{5,0},{5,1},{6,0},{7,0},{7,1},{8,0},{8,1},{9,0},{10,0},{11,0},{11,1},{12,0}},
    {0x80000000, 0x20000000, 0x13800000, 0x09400000, 0x02600000, 0x00790000, 0x00750000, 0x00122000, 0x00121000,
     0x0003A000, 0x00008D00, 0x00008A80, 0x00000F00, 0x00000EE0, 0x00000120, 0x00000018, 0x00000002, 0x00000001, 0x00000000} },
  { /* (24,1): */ 17,
    {1,7,66,220,495,326,924,792,495,4,66,11,1},
    {1,5,6,9,12,15,17,18,20,22,23,24,25,26,27,28,28},
    {{0,0},{1,0},{1,1},{2,0},{3,0},{4,0},{5,0},{5,1},{6,0},{7,0},{8,0},{9,0},{9,1},{10,0},{11,0},{11,1},{12,0}},
    {0x80000000, 0x48000000, 0x34000000, 0x13000000, 0x05400000, 0x01620000, 0x00BF0000, 0x004A8000,
     0x0010C000, 0x00046000, 0x00008200, 0x00007E00, 0x00001200, 0x00000180, 0x00000020, 0x00000010, 0x00000000} },
  { /* (24,2): */ 17,
    {1,12,47,220,495,792,924,1,495,220,58,11,1},
    {2,5,8,9,11,14,16,18,19,20,21,22,23,24,24,25,25},
    {{0,0},{1,0},{2,0},{2,1},{3,0},{4,0},{5,0},{6,0},{7,0},{7,1},{8,0},{9,0},{10,0},{10,1},{11,0},{11,1},{12,0}},
    {0xC0000000, 0x60000000, 0x31000000, 0x27800000, 0x0C000000, 0x04440000, 0x012C0000, 0x00450000,
     0x0044E000, 0x00137000, 0x0003F800, 0x00008800, 0x00001400, 0x00000C00, 0x00000100, 0x00000080, 0x00000000} },
  { /* (24,3): */ 17,
    {1,6,66,1,495,4,924,792,495,220,66,7,1},
    {2,5,6,8,10,11,13,14,15,16,18,19,20,21,21,22,22},
    {{0,0},{1,0},{1,1},{2,0},{3,0},{3,1},{4,0},{5,0},{5,1},{6,0},{7,0},{8,0},{9,0},{10,0},{11,0},{11,1},{12,0}},
    {0xC0000000, 0x90000000, 0x78000000, 0x36000000, 0x35C00000, 0x1A600000, 0x0AE80000, 0x0AD80000,
     0x04B00000, 0x01140000, 0x004E0000, 0x00102000, 0x00026000, 0x00005000, 0x00001800, 0x00000400, 0x00000000} },

  { /* (24,4): */ 15,
    {1,12,66,220,495,10,924,792,495,220,66,7,1},
    {3,6,8,10,12,13,14,15,16,17,18,19,19,20,20},
    {{0,0},{1,0},{2,0},{3,0},{4,0},{5,0},{5,1},{6,0},{7,0},{8,0},{9,0},{10,0},{11,0},{11,1},{12,0}},
    {0xE0000000, 0xB0000000, 0x6E000000, 0x37000000, 0x18100000, 0x17C00000, 0x0B880000, 0x04500000,
     0x01380000, 0x00408000, 0x00098000, 0x00014000, 0x00006000, 0x00001000, 0x00000000} },
  { /* (24,5): */ 16,
    {1,12,66,220,495,792,451,792,2,220,66,11,1},
    {4,6,8,10,12,13,14,15,16,16,17,17,18,18,19,19},
    {{0,0},{1,0},{2,0},{3,0},{4,0},{5,0},{6,0},{6,1},{7,0},{8,0},{8,1},{9,0},{10,0},{11,0},{11,1},{12,0}},
    {0xF0000000, 0xC0000000, 0x7E000000, 0x47000000, 0x28100000, 0x0F500000, 0x08440000, 0x04920000,
     0x017A0000, 0x01780000, 0x00818000, 0x00138000, 0x00030000, 0x00004000, 0x00002000, 0x00000000} },
  { /* (24,6): */ 17,
    {1,8,65,220,2,792,924,792,495,220,59,12,1},
    {4,6,7,8,9,10,11,12,13,14,15,16,16,16,17,17,17},
    {{0,0},{1,0},{1,1},{2,0},{2,1},{3,0},{4,0},{4,1},{5,0},{6,0},{7,0},{8,0},{9,0},{10,0},{10,1},{11,0},{12,0}},
    {0xF0000000, 0xD0000000, 0xC8000000, 0x87000000, 0x86800000, 0x4F800000, 0x4F400000, 0x30700000,
     0x17B00000, 0x09400000, 0x03100000, 0x01210000, 0x00450000, 0x000A0000, 0x00068000, 0x00008000, 0x00000000} },
  { /* (24,7): */ 15,
    {1,12,66,220,495,62,924,792,495,220,66,8,1},
    {5,7,9,10,11,12,13,13,14,15,15,15,15,15,16},
    {{0,0},{1,0},{2,0},{3,0},{4,0},{5,0},{5,1},{6,0},{7,0},{8,0},{9,0},{10,0},{11,0},{12,0},{11,1}},
    {0xF8000000, 0xE0000000, 0xBF000000, 0x88000000, 0x4A200000, 0x46400000, 0x2F700000, 0x12900000,
     0x06300000, 0x02520000, 0x009A0000, 0x00160000, 0x00060000, 0x00040000, 0x00000000} },
  { /* (24,8): */ 15,
    {1,12,66,220,287,792,924,792,495,220,62,12,1},
    {6,8,9,10,11,12,12,13,14,14,14,14,14,14,15},
    {{0,0},{1,0},{2,0},{3,0},{4,0},{4,1},{5,0},{6,0},{7,0},{8,0},{9,0},{10,0},{11,0},{12,0},{10,1}},
    {0xFC000000, 0xF0000000, 0xCF000000, 0x98000000, 0x74200000, 0x67200000, 0x35A00000, 0x18C00000,
     0x0C600000, 0x04A40000, 0x01340000, 0x003C0000, 0x000C0000, 0x00080000, 0x00000000} },
  { /* (24,9): */ 14,
    {1,12,66,220,417,792,924,792,495,220,66,12,1},
    {7,8,9,11,11,12,12,13,13,13,13,13,13,14},
    {{0,0},{1,0},{2,0},{3,0},{4,0},{4,1},{5,0},{6,0},{7,0},{8,0},{10,0},{11,0},{12,0},{9,0}},
    {0xFE000000, 0xF2000000, 0xD1000000, 0xB5800000, 0x81600000, 0x7C800000, 0x4B000000, 0x2E200000,
     0x15600000, 0x05E80000, 0x03D80000, 0x03780000, 0x03700000, 0x00000000} },
  { /* (24,10): */ 15,
    {1,12,66,220,221,792,923,792,495,220,66,12,1},
    {7,9,10,11,11,12,12,12,12,12,12,13,13,13,13},
    {{0,0},{1,0},{2,0},{3,0},{4,0},{4,1},{5,0},{6,0},{10,0},{11,0},{12,0},{6,1},{7,0},{8,0},{9,0}},
    {0xFE000000, 0xF8000000, 0xE7800000, 0xCC000000, 0xB0600000, 0x9F400000, 0x6DC00000, 0x34100000,
     0x2FF00000, 0x2F300000, 0x2F200000, 0x2F180000, 0x16580000, 0x06E00000, 0x00000000} },
  { /* (24,11): */ 14,
    {1,12,23,220,495,792,924,792,495,220,66,12,1},
    {8,10,10,11,11,11,11,12,12,12,12,12,12,13},
    {{0,0},{1,0},{2,0},{2,1},{3,0},{11,0},{12,0},{4,0},{5,0},{6,0},{8,0},{9,0},{10,0},{7,0}},
    {0xFF000000, 0xFC000000, 0xF6400000, 0xF0E00000, 0xD5600000, 0xD3E00000, 0xD3C00000, 0xB4D00000,
     0x83500000, 0x49900000, 0x2AA00000, 0x1CE00000, 0x18C00000, 0x00000000} },
  { /* (24,12): */ 14,
    {1,12,66,220,495,792,504,792,495,220,66,12,1},
    {10,10,10,10,11,11,12,12,12,12,12,12,12,13},
    {{0,0},{1,0},{11,0},{12,0},{2,0},{10,0},{3,0},{4,0},{5,0},{6,0},{7,0},{8,0},{9,0},{6,1}},
    {0xFFC00000, 0xFCC00000, 0xF9C00000, 0xF9800000, 0xF1400000, 0xE9000000, 0xDB400000, 0xBC500000,
     0x8AD00000, 0x6B500000, 0x39D00000, 0x1AE00000, 0x0D200000, 0x00000000} }
};
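Each decoder table entry is redundant: the left-justified bases follow from the subgroup sizes and code lengths, because codewords are assigned in decreasing order and the last subgroup must end at the all-zero codeword. The sketch below recomputes the bases for one entry and compares them with the stored values. It is an illustrative consistency check, not part of the original library; check_table, binom12 and the t126_* arrays (a copy of the (12,6) entry) are names introduced here.

```c
#include <assert.h>
#include <stdint.h>

#define CHK_N 12

/* binomial coefficients C(12,k), i.e. the sizes of the (n,k) groups */
static const unsigned binom12[CHK_N + 1] =
  {1,12,66,220,495,792,924,792,495,220,66,12,1};

/* copy of the (12,6) table entry, used as sample input */
static const unsigned short t126_nk[CHK_N + 1] =
  {1,12,66,47,495,792,924,792,495,85,66,12,1};
static const unsigned char t126_len[15] =
  {8,8,9,9,11,11,11,11,12,12,12,12,12,12,13};
static const unsigned char t126_kj[15][2] =
  {{0,0},{12,0},{1,0},{11,0},{2,0},{3,0},{9,0},{10,0},
   {3,1},{9,1},{4,0},{5,0},{7,0},{8,0},{6,0}};
static const uint32_t t126_base[15] =
  {0xFF000000u, 0xFE000000u, 0xF8000000u, 0xF2000000u, 0xE9C00000u,
   0xE3E00000u, 0xD9400000u, 0xD1000000u, 0xC6300000u, 0xBDC00000u,
   0x9ED00000u, 0x6D500000u, 0x3BD00000u, 0x1CE00000u, 0x00000000u};

/* Returns 1 if the stored bases match the ones implied by the subgroup
 * sizes and lengths, and the code is complete (ends at codeword 0). */
static int check_table (unsigned sgs, const unsigned short *nk,
                        const unsigned char *len,
                        const unsigned char (*kj)[2],
                        const uint32_t *lj_base)
{
  uint64_t code;
  unsigned cur, j;
  assert (sgs > 0);
  cur = len[0];
  code = (uint64_t)1 << cur;          /* one past the largest codeword */
  for (j = 0; j < sgs; j++) {
    unsigned k = kj[j][0], size;
    if (k > CHK_N || len[j] < cur || len[j] < 1 || len[j] > 32) return 0;
    /* second subgroup of a split group holds the remaining elements */
    size = kj[j][1] ? binom12[k] - nk[k] : nk[k];
    code <<= (len[j] - cur);          /* extend to the new code length */
    cur = len[j];
    if (code < size) return 0;        /* more codewords than available */
    code -= size;                     /* smallest codeword of subgroup j */
    if ((uint32_t)(code << (32 - cur)) != lj_base[j]) return 0;
  }
  return code == 0;
}
```

A call such as check_table(15, t126_nk, t126_len, t126_kj, t126_base) is expected to succeed for a consistent entry and to fail if any base, length or subgroup size is altered.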



/* encoder tables (computed using decoder's tables): */
static BLADE_ENC enc_t [1+(N/2+1)+(N+1)];

/* initialize encoder: */
void blade_enc_init (void)
{
  unsigned int i[N+1], j, k, l, w;
  /* init enc_t[]: */
  for (j=0; j<1+(N/2+1)+(N+1); j++) {
    for (k=0; k<=N; k++) enc_t[j].nk[k] = dec_t[j].nk[k];
    for (l=0; l<dec_t[j].sgs; l++) {   /* only the subgroups actually in use */
      enc_t[j].sg[dec_t[j].kj[l].k][dec_t[j].kj[l].j] = l;
      enc_t[j].len[l] = dec_t[j].len[l];
      enc_t[j].base[l] = dec_t[j].lj_base[l] >> (32-dec_t[j].len[l]);
    }
  }
  /* init w_ki[]: */
  for (k=0; k<=N; k++) i[k] = 0;
  for (w=0; w<(1<<N); w++) {
    for (k=0,j=0; j<N; j++) if (w & (1<<j)) k++;
    w_ki[w].k = k;
    w_ki[w].i = i[k];
    i[k] ++;
  }
}



/* initialize decoder: */
void blade_dec_init (void)
{
  static short b[N+1] = {1,12,66,220,495,792,924,792,495,220,66,12,1};
  unsigned int i[N+1], j, k, w;
  /* init ki_w[]: */
  for (j=0,k=0; k<=N; j+=b[k],k++) {ki_w[k] = _w + j; i[k] = 0;}
  for (w=0; w<(1<<N); w++) {
    for (k=0,j=0; j<N; j++) if (w & (1<<j)) k++;
    ki_w[k][i[k]] = w;
    i[k] ++;
  }
}
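Both initialization routines enumerate the blocks w in increasing order and number them separately within each weight class k, so the encoder's w -> (k, index) mapping and the decoder's (k, index) -> w mapping are inverses of each other. A minimal sketch of this enumeration for a hypothetical block size of 4 bits (TOY_N and all toy_* names are illustrative only):

```c
#include <assert.h>

#define TOY_N 4   /* hypothetical small block size, for illustration */

static unsigned char  toy_k[1 << TOY_N];              /* w -> weight k        */
static unsigned char  toy_i[1 << TOY_N];              /* w -> index in class  */
static unsigned short toy_w[TOY_N + 1][1 << TOY_N];   /* (k, index) -> w      */

/* number of non-zero bits in w */
static unsigned toy_popcount (unsigned w)
{
  unsigned k = 0;
  while (w) { k += w & 1; w >>= 1; }
  return k;
}

/* build both mappings in a single pass over all blocks */
static void toy_init (void)
{
  unsigned count[TOY_N + 1] = {0};
  unsigned w, k;
  assert (TOY_N <= 8);                 /* tables above use 8-bit entries */
  for (w = 0; w < (1u << TOY_N); w++) {
    k = toy_popcount (w);
    toy_k[w] = (unsigned char) k;
    toy_i[w] = (unsigned char) count[k];
    toy_w[k][count[k]] = (unsigned short) w;
    count[k]++;
  }
}
```

For example, the weight-2 blocks are visited in the order 3, 5, 6, 9, 10, 12, so block 5 (binary 0101) receives the pair (k, index) = (2, 1).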

/* encoder's functions: */
unsigned blade_enc_0 (unsigned w, BITSTREAM *bs)
{
  return blade_enc (w, enc_t + 0, bs);
}

unsigned blade_enc_1 (unsigned w, unsigned cx, BITSTREAM *bs)
{
  unsigned r;
  if (cx > N/2)
    r = N - blade_enc (w ^ ((1<<N)-1), enc_t + 1 + N - cx, bs);
  else
    r = blade_enc (w, enc_t + 1 + cx, bs);
  return r;
}

unsigned blade_enc_2 (unsigned w, unsigned cx1, unsigned cx2, BITSTREAM *bs)
{
  unsigned cx = cx1 + cx2, r;
  if (cx > N)
    r = N - blade_enc (w ^ ((1<<N)-1), enc_t + 1 + (N/2 + 1) + 2*N - cx, bs);
  else
    r = blade_enc (w, enc_t + 1 + (N/2 + 1) + cx, bs);
  return r;
}



/* decoder's functions: */
unsigned blade_dec_0 (unsigned *w, BITSTREAM *bs)
{
  return blade_dec (w, dec_t + 0, bs);
}

unsigned blade_dec_1 (unsigned *w, unsigned cx, BITSTREAM *bs)
{
  unsigned b, r;
  if (cx > N/2) {
    r = N - blade_dec (&b, dec_t + 1 + N - cx, bs);
    b ^= (1<<N)-1;
  } else
    r = blade_dec (&b, dec_t + 1 + cx, bs);
  *w = b;
  return r;
}

unsigned blade_dec_2 (unsigned *w, unsigned cx1, unsigned cx2, BITSTREAM *bs)
{
  unsigned cx = cx1 + cx2, b, r;
  if (cx > N) {
    r = N - blade_dec (&b, dec_t + 1 + (N/2 + 1) + N*2 - cx, bs);
    b ^= (1<<N)-1;
  } else
    r = blade_dec (&b, dec_t + 1 + (N/2 + 1) + cx, bs);
  *w = b;
  return r;
}
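The context-dependent functions store tables only for the lower half of the context range. When more than half of the context bits are set, the block is complemented and the mirrored table is used, and the number of set bits is recovered as N minus the count of the complemented block. The index arithmetic can be sketched as follows; ctx_table_1, ctx_table_2 and ctx_weight are hypothetical helper names introduced for illustration, not functions of the library.

```c
#include <assert.h>

#define CTX_N 12   /* block size, as in the listing */

/* Table index for a one-block context cx (0..N).
 * *flip is set when the block must be complemented before coding. */
static unsigned ctx_table_1 (unsigned cx, int *flip)
{
  assert (cx <= CTX_N);
  *flip = cx > CTX_N / 2;
  return 1 + (*flip ? CTX_N - cx : cx);
}

/* Table index for a two-block context cx1 + cx2 (0..2N). */
static unsigned ctx_table_2 (unsigned cx1, unsigned cx2, int *flip)
{
  unsigned cx = cx1 + cx2;
  assert (cx <= 2 * CTX_N);
  *flip = cx > CTX_N;
  return 1 + (CTX_N / 2 + 1) + (*flip ? 2 * CTX_N - cx : cx);
}

/* number of non-zero bits in a block */
static unsigned ctx_weight (unsigned w)
{
  unsigned k = 0;
  while (w) { k += w & 1; w >>= 1; }
  return k;
}
```

With N = 12 the indices cover 1 + 7 + 13 = 21 tables: index 0 for the universal code, 1..7 for one-block contexts and 8..20 for two-block contexts.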



/* main.c - test program and demo: */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

#define M 1000      /* max # of blocks in test sequence */
#define Q 1000000   /* # of iterations */

/* test program: */
int main ()
{
  /* in/out buffers: */
  static unsigned char in_buffer [M*N/8];
  static unsigned char out_buffer [M*N/8 + 1024];
  static BITSTREAM in, out;

  /* vars: */
  unsigned char *pbs; int bit_offset;
  unsigned int w, cx, cx1 = 0, cx2 = 0;
  int i, j, k, m;
  double p, h, c;

  /* init BLADE-12 library: */
  blade_enc_init ();
  blade_dec_init ();

  /* scan sources: */
  for (p=0.01; p<=0.991; p+=0.01) {

    /* estimate entropy: */
    h = - (p * log(p) + (1.-p) * log(1.-p)) / log(2.);
    printf ("\np=%g, h=%g\n", p, h);

    /* try different # of blocks: */
    for (m=1; m<M; m++)
    {
      c = 0.;

      /* reset generator: */
      srand (1);

      /* make Q runs: */
      for (i=0; i<Q; i++) {

        /* generate test sequence: */
        memset (in_buffer, 0, sizeof in_buffer);
        bitstream_open (&in, in_buffer, 0, 0);
        for (j=0; j<N*m; j++) {
          /* get a next bit from a pseudo-Bernoulli source: */
          k = ((double) rand() / (double) RAND_MAX) > (1. - p);
          /* insert it in bitstream: */
          put_bits (k, 1, &in);
        }
        bitstream_close (&in, &pbs, &bit_offset, 1);

        /* start encoding: */
        memset (out_buffer, 0, sizeof out_buffer);
        bitstream_open (&out, out_buffer, 0, 0);
        bitstream_open (&in, in_buffer, 0, 1);

        /* run the encoder: */
        for (j=0; j<m; j++) {
          /* block to be encoded: */
          w = (unsigned) get_bits (N, &in);
          /* choose context and encode: */
          if (j == 0)
            cx1 = blade_enc_0 (w, &out);            /* no context */
          else if (j == 1)
            cx2 = blade_enc_1 (w, cx1, &out);       /* use cx1 */
          else {
            cx = blade_enc_2 (w, cx1, cx2, &out);   /* use cx1 and cx2 */
            /* scroll contexts: */
            cx1 = cx2;
            cx2 = cx;
          }
        }

        /* close bitstreams: */
        bitstream_close (&in, &pbs, &bit_offset, 0);
        bitstream_close (&out, &pbs, &bit_offset, 1);

        /* compute coding cost: */
        c += (double) ((pbs - out_buffer) * 8 + bit_offset) / (double) (m*N);
        /* start decoding: */
        bitstream_open (&in, in_buffer, 0, 1);
        bitstream_open (&out, out_buffer, 0, 1);

        /* run the decoder: */
        for (j=0; j<m; j++) {
          /* choose the context and decode: */
          if (j == 0)
            cx1 = blade_dec_0 (&w, &out);            /* no context */
          else if (j == 1)
            cx2 = blade_dec_1 (&w, cx1, &out);       /* use cx1 */
          else {
            cx = blade_dec_2 (&w, cx1, cx2, &out);   /* use cx1 and cx2 */
            /* scroll contexts: */
            cx1 = cx2;
            cx2 = cx;
          }
          /* compare with the original block: */
          if (w != get_bits (N, &in)) {
            printf ("?%d,", j);
          }
        }

        /* close bitstreams: */
        bitstream_close (&in, &pbs, &bit_offset, 0);
        bitstream_close (&out, &pbs, &bit_offset, 0);
      }



      /* print results: */
      c /= (double) Q;
      printf ("[%d,%g], ", m*N, (c-h)/h);
      fflush (stdout);
    }
  }
  return 1;
}



